{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# FMA: A Dataset For Music Analysis\n",
    "\n",
    "Michaël Defferrard, Kirell Benzi, Pierre Vandergheynst, Xavier Bresson, EPFL LTS2.\n",
    "\n",
    "## Free Music Archive web API\n",
    "\n",
    "All the data in the `raw_*.csv` tables was collected from the Free Music Archive [public API](https://freemusicarchive.org/api). With this notebook, you can:\n",
    "* reconstruct the original data, \n",
    "* update some fields, e.g. the `track listens` (play count),\n",
    "* augment the data with newer fields wich may have been introduced in their API,\n",
    "* update the dataset with new songs added to the archive.\n",
    "\n",
    "Notes:\n",
    "* You need a key to access the API, which you can [request online](https://freemusicarchive.org/api/agreement) and write into your `.env` file as a new line reading `FMA_KEY=MYPERSONALKEY`.\n",
    "* Requests take some hunderd milliseconds to complete."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import IPython.display as ipd\n",
    "import utils"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fma = utils.FreeMusicArchive(os.environ.get('FMA_KEY'))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1 Get recently added tracks\n",
    "\n",
    "* `track_id` are assigned in monotonically increasing order.\n",
    "* Tracks can be removed, so that number does not indicate the number of available tracks."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "for track_id, artist_name, date_created in zip(*fma.get_recent_tracks()):\n",
    "    print(track_id, date_created, artist_name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2 Get metadata about tracks, albums and artists\n",
    "\n",
    "Given IDs, we can get information about tracks, albums and artists. See the available fields in the [API documentation](https://freemusicarchive.org/api)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fma.get_track(track_id=2, fields=['track_title', 'track_date_created',\n",
    "                                  'track_duration', 'track_bit_rate',\n",
    "                                  'track_listens', 'track_interest', 'track_comments', 'track_favorites',\n",
    "                                  'artist_id', 'album_id'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fma.get_track_genres(track_id=20)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fma.get_album(album_id=1, fields=['album_title', 'album_tracks',\n",
    "                                  'album_listens', 'album_comments', 'album_favorites',\n",
    "                                  'album_date_created', 'album_date_released'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "fma.get_artist(artist_id=1, fields=['artist_name', 'artist_location',\n",
    "                                    'artist_comments', 'artist_favorites'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3 Get data, i.e. raw audio\n",
    "\n",
    "We can download the original audio as well. Tracks are provided by the archive as MP3 with various bit and sample rates."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "track_file = fma.get_track(2, 'track_file')\n",
    "fma.download_track(track_file, path='track.mp3')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4 Get genres\n",
    "\n",
    "Instead of compiling the genres of each track, we can get all the genres present on the archive with some API calls."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "genres = fma.get_all_genres()\n",
    "print('{} genres'.format(genres.shape[0]))\n",
    "genres[10:25]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "And look for genres related to Rock."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "genres[['Rock' in title for title in genres['genre_title']]]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "genres[genres['genre_parent_id'] == '12']"
   ]
  }
 ],
 "metadata": {},
 "nbformat": 4,
 "nbformat_minor": 2
}
